Set up file structure

load in different datasets

checking against Detailed Methods that all species have correct docs


we use this data to answer

1) what are the characteristics of partnership for species that are precluded from listing?

a. How many partners are there?
    1. Total number of partners
## [1] 197
    1. Histogram of # partners/species
##  2  3  4  5  6  8  9 10 11 13 14 19 20 21 22 24 25 26 27 31 32 33 34 38 41 
##  2  1  3  2  2  7  1 20  4  2 11 10  4  6  1 17  3  7 11  9  2  7  1  2  3 
## 43 44 46 47 49 50 52 54 59 60 61 63 66 67 68 69 70 75 77 80 81 82 84 86 87 
##  7  1  8  5  3 51  6  2  2 12  4  2  3  6  7  3  4  2 13  3  3 10  1  3  7 
## 88 90 
##  5  2
##                                     common_name TotalPartners
## 2                               GOODING'S ONION             2
## 3                            PACKARDS MILKVETCH             1
## 4     ARCTIC GRAYLING- UPPER MISSOURI RIVER DPS             3
## 5                               ARIZONA BUGBANE             2
## 6                                ASHLAND LUPINE             2
## 8                            BEAVER CAVE BEETLE             7
## 9                           BLUE DIAMOND CHOLLA             1
## 10                             BRAND'S PHACELIA            20
## 11               CAMP SHELBY BURROWING CRAYFISH             4
## 13                       CHRIST\xcdS PAINTBRUSH             2
## 14                            CLOKEY'S EGGVETCH            11
## 19                      COPPERBELLY WATER SNAKE            10
## 20           CORAL PINK SAND DUNES TIGER BEETLE             4
## 21                            COW HEAD TUI CHUB             6
## 22                       CUYAMAC LAKE DOWNINGIA             1
## 24                       DUNES SAGEBRUSH LIZARD            17
## 25                             EAGLE LAKE TROUT             3
## 26              ELONGATE MUD MEADOW SPRINGSNAIL             7
## 27                    FLAT-TAILED HORNED LIZARD            11
## 31                                GEORGIA ASTER             9
## 32                        GOOSE CREEK MILKVETCH             2
## 33                           GRAHAM BEARDTONGUE             7
## 34                    GREATER ADAMS CAVE BEETLE             1
## 38                         HENDERSON'S HORKELIA             2
## 41                         KAIBAB PLAINS CACTUS             3
## 43                                   LEAST CHUB             7
## 44                     LESSER ADAMS CAVE BEETLE             1
## 46                  MCCLOUD RIVER REDBAND TROUT             8
## 47                               MERCED CLARKIA             5
## 49 NEVARES SPRING NAUCORID BUG (=FURNACE CREEK)             3
## 50                       NEW ENGLAND COTTONTAIL            51
## 52                           ORCUTTS'S HAZARDIA             6
## 54                             PAGE SPRINGSNAIL             2
## 59                          RELICT LEOPARD FROG             2
## 60                   RIO GRANDE CUTTHROAT TROUT            12
## 61   SACRAMENTO MOUNTAINS CHECKERSPOT BUTTERFLY             4
## 63              SAN FERNANDO VALLEY SPINEFLOWER             2
## 66                         SHORT-LEAVED DUDLEYA             3
## 67                           SICKLEFIN REDHORSE             6
## 68                        SLICKSPOT PEPPERGRASS             7
## 69                    SOLDIER MEADOW CINQUEFOIL             3
## 70               SOUTHERN IDAHO GROUND SQUIRREL             4
## 75                       SURPRISING CAVE BEETLE             2
## 77                           TAHOE YELLOW CRESS            13
## 80                         UMPQUA MARIPOSA LILY             3
## 81        WARM SPRINGS ZAITZEVIAN RIFFLE BEETLE             3
## 82                   WASHINGTON GROUND SQUIRREL            10
## 84                                    WEKIU BUG             1
## 86                        WET CANYON TALUSSNAIL             3
## 87                      WHITE RIVER BEARDTONGUE             7
## 88                      WONDERLAND ALICE-FLOWER             5
## 90                       YADKIN RIVER GOLDENROD             2

    1. Max, min, median, and 1st and 3rd quantiles across species
summary(PartnerData$TotalPartners)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   2.000   3.500   6.019   7.000  51.000
b. How does number of partners relate to species/recovery programs characteristics?
  • i) Thesis tested: Area, %publicland, taxa (1 = flowering plant), employment, total number of threats (a few other predictors looked at and removed due to high correlation were human footprint, threats = habitat as threat (1/0 as y/n))

  • ii) Test performing regression of form • log(# of partners) = B1(taxa) + B2(%public land) + B3(area) + B4(sum of relevant employment sectors) + B5(total number of threats)

lm2 <- lm(log(total) ~ taxa + percentpublic + area_x + nsumemploy + total_x_x, data=RegData)
summary(lm2)
## 
## Call:
## lm(formula = log(total) ~ taxa + percentpublic + area_x + nsumemploy + 
##     total_x_x, data = RegData)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.5279 -0.4195 -0.0260  0.4178  2.2137 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)  
## (Intercept)    8.906e-01  3.600e-01   2.474   0.0175 *
## taxa          -3.463e-01  2.647e-01  -1.308   0.1980  
## percentpublic  1.517e-02  4.471e-01   0.034   0.9731  
## area_x         4.853e-12  3.607e-12   1.345   0.1857  
## nsumemploy     6.771e-07  3.084e-06   0.220   0.8273  
## total_x_x      2.543e-01  9.482e-02   2.682   0.0104 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8096 on 42 degrees of freedom
##   (3 observations deleted due to missingness)
## Multiple R-squared:   0.24,  Adjusted R-squared:  0.1495 
## F-statistic: 2.653 on 5 and 42 DF,  p-value: 0.03585
plot(lm2)

qqPlot(residuals(lm2))

## 31 43 
## 30 41
AIC(lm2)
## [1] 123.5299

Most of this code ^ was taken from regressionmodelforthesis.Rmd


b) who are the organizations partnering with FWS?

a. List of all the organizations involved
PartnerData <- FormatData(RawData) #reload data to avoid sum(colsums) glitch
#kable(table(colSums(PartnerData[,-c(1:3)])))

#also includes number of species each partner is working on
name <- colSums(PartnerData[,-c(1:3)])
name %>% kable()
x
USFS 18
USFWS 42
BLM 22
Montana Department of Fish Wildlife and Parks 1
Kentucky Ecological Services Field Office 1
Owner of the Beaver Cave property 1
Kentucky Department of Fish and Wildlife Resources 2
NRCS 4
Farm Service Agency 2
Kentucky State Nature Preserves Commission 1
Kentucky Division of Forestry 1
California Department of Fish and Game 6
California Department of Forestry and Fire Protection 1
California Department of Parks and Recreation 2
Center for Natural Lands Management 1
City of Riverside Park and Recreation Department 1
Metropolitan Water District 1
Riverside County Environmental Programs Department 1
Riverside County Habitat Conservation Agency 1
Riverside County Regional Park and Open-Space District 1
Riverside Land Conservancy 1
San Diego State University Field Stations Program 1
The Nature Conservancy 3
University of California Riverside 1
US Navy 2
US Marine Corps 2
US Customs and Border Protection 1
California State Parks 2
Mississippi Army National Guard 1
Mississippi Department of Wildlife Fisheries and Parks 1
Nevada Department of Conservation and Natural Resources 1
Clark County 1
NPS 7
Nevada Department of Wildlife 3
Nevada Department of Transportation 1
Nevada Division of State Parks 2
U.S. Air Force 1
Boulder City 1
Illinois Department of Natural Resources 1
Indiana Department of Natural Resources 1
Kentucky Coal Association 1
Kentucky Coal Country Association 1
Kentucky Farm Bureau 1
Kentucky Natural Resources and Environmental Cabinet 1
Western Kentucky Coal Association 1
Office of Surface Mining Reclamation and Enforcement 1
Utah Division of Parks and Recreation 1
Kane County 1
private landowners of Cow Head Lake Cow Head Slough and California reach of Barrel Creek (four owners all CA signatories) 1
principal permittees on BLM lands within the drainage 1
California and Modoc County Cattlemen’s Associations 1
the California Farm Bureau Federation 1
California Department of Fish and Wildlife Natural Heritage Division Endangered plant program 1
Center of Excellence for Hazardous Materials Management 1
Texas A&M University 1
Texas Comptroller of Public Accounts 1
Texas Interagency Task Force on Economic Growth and Endangered Species 1
Texas Department of Agriculture 1
Texas Parks and Wildlife Department 1
Railroad Commission of Texas 1
University of Texas System - University Lands 1
Texas Farm Bureau 1
Texas Oil and Gas Association 1
Texas Royalty Council 1
Texas and Southwestern Cattle Raisers Association 1
Texas Wildlife Association 1
Texas Association of Business 1
California Department of Fish and Wildlife 3
Nevada Natural Heritage Program 2
Desert Research Institute 2
Anza-Borrego State Park 1
Arizona Game and Fish 1
Ocotillo Wells 1
US Bureau of Reclamation 1
US Naval Air Facility 1
Clemson University 1
Georgia Department of Natural Resources 2
Georgia Department of Transportation 1
Georgia Power 1
Mecklenburg County Park and Recreation North Carolina 1
North Carolina Department of Agriculture & Consumer Services Plant Conservation Program 1
Uintah County 2
Rio Blanco County 2
Utah School and Institutional Trust Lands Administration 2
Utah Governors Public Lands Policy Coordination Office 2
Utah Division of Wildlife Resources 2
Utah Department of Natural Resources 1
Bureau of Reclamation 1
Utah Reclamation Mitigation and Conservation Commission 1
Confederated Tribes of the Goshute Reservation 1
Central Utah Water Conservancy District 1
Southern Nevada Water Authority 1
John Hancock Mutual Life Insurace Company 1
Bob McIntsh (Private landowner) 1
Sierra Pacific Industries 1
Hearest Corporation 1
Siskiyou County Board of Supervisors 1
California Department of Transportation (caltrans) 1
Pacific Gas and Electric 1
Department of Forest and Rangeland Stewardship Colorado State University 1
USGS 1
Mashpee Wampanoag Tribe 1
Lyme Land Conservation Trust 1
American Forest Foundation 1
Woodcock Limited 1
WCS Queens Zoo 1
Wells National Esturarine Research Reserve 1
Roger Williams Park Zoo 1
Audubon Connecticut 1
Connecticut Audubon Society 1
Open Space Institute 1
Audubon New York 1
Quail Forever 1
Pheasants Forever 1
Doris Duke Charitable Foundation 1
Wildlife Conservation Society 1
Amrican Bird Conservancy 1
Quality Deer Management Association 1
Sustainable Forestry Initiative 1
White Memorial Foundation 1
National Fish and Wildlife Foundation 1
Ruffed Grouse Society/American Woodcock Society 1
National Wild Turkey Federation 1
Wildlife Management Institute 1
New Engalnd Cottontail Conservation Initiative 1
Northeast Forest and Fire Management 1
Lyme Timber Company 1
Monterey Preservation Land Trust 1
Narrow River Land Trust 1
Nantucket Conservation Foundation 1
Scarborough Land Trust 1
Avalonia Land Conservancy 1
Orenda Wildlife Land Trust 1
Trustees of Reservations 1
Berkshire Natural Resources Council 1
York Land Trust 1
Becket Land Trust 1
Trust for Public Land 1
Massachusetts National Guard 1
New York Division of Fish Wildlife and Marine Resources 1
Northeast Association of Fish and Wildlife Agencies 1
Rhode Island Division of Fish and Wildlife 1
Connecticute Department of Energy and Environmental Protection 1
Massachusetts Division of Fisheries and Wildlife 1
New Hampshire Fish and Game Department 1
Maine Department of Inland Fisheries and Wildlife 1
University of Rhode Island College of Envionment and Life Sciences 1
University of New Hampshire 1
University of New Hampshire Cooperative Extenson 1
City of NCCP 1
City of Carlsbad 1
City of San Marcos 1
California Resources Agency 1
Arizona Game and Fish Department 2
Colorado Parks and Wildlife 1
New Mexico Department of Game and Fish 1
Mescalero Apache Nation 1
Jicarilla Apache Nation 1
Taos Pueblo 1
Trout Unlimited 1
Vermejo Park Ranch 1
Colorado Division of Parks and Wildlife 1
Otero County 1
Village of Cloudcroft 1
Newhall Land Farming Company 1
San Diego Gas and Electric Company 1
North Carolina Wildlife Resources Commission 1
Duke Energy Carolinas 1
Eastern Band of Cherokee Indians 1
Tennessee Valley Authority 1
Office of Species Conservation 1
Idaho Department of Fish and Game 2
Idaho Department of Lands 1
Idaho Army National Guard 1
Nongovernmental Cooperator Representative 1
US Air Force 1
Nevada Division of Wildlife 1
Idaho Governor’s Office of Species Conservation 1
Soulen Livestock Company Inc. (Soulen Livestock) 1
California State Lands Commission 1
California Tahoe Conservancy 1
League to Save Lake Tahoe 1
Nevada Division of Forestry 1
Nevada Division of State Lands 1
Tahoe Lakefront Owners’s Association 1
Tahoe Regional Planning Agency 1
USDA 1
Montana State University 1
Montana Fish Wildlife and Parks 1
Foster Creek Conservation District 1
Threemile Canyon Farms 1
Portland General Electric 1
Washington State Department of Fish and Wildlife 1
Washington State Department of Natural Resources 1
University of Hawaii 1
Utah State Office 1
Alcoa Power Generating Inc 1
 # kable("html") %>% 
 # kable_styling(font_size = 7)
b. Which partners are most commonly working with others
  • i) Histogram of partnerships/partner
#going to convert all values that are greater than 1 to one so not double counting 
# then subtract diagonal 
#then add 
pdata <- adjmatrix
newmat <- pdata[1,]
newmat[,1] <- "empty" 
#add empty row with same no of columns to combine with current matrix 
newMatrix <- rbind(newmat, pdata)

newMatrix <- newMatrix %>% mutate_if(is.numeric, ~1 * (. != 0))

diag(newMatrix)=0
#delete top (empty) row 
newMatrix <- newMatrix[-1,] ##I think this matrix is worth keeping (all numers now 1s and 0s )
asnum <- newMatrix[,-c(1)]
names <- newMatrix[,c(1)]
rssum <- rowSums(newMatrix[,c(-1)])
newdf <- cbind(names, rssum) ## yay I did it!! 

## now going to try and filter

explore <- newdf 
explore <- explore %>% arrange(-rssum)
(explore[c(1:4),])
##                                       X1 rssum
## 1                                  USFWS   173
## 2                                    BLM    93
## 3                                   USFS    63
## 4 Natural Resources Conservation Service    56
#still feeds mis-represented 

#summary count of the number of partnership for each partner (how many partner have 2 parnters, 3 etc? ) 
ggplot(data = explore, mapping = aes(x=rssum)) + geom_bar() + scale_x_continuous(name ="Number of Partners", limits=c(-1,175)) + scale_y_continuous(name ="Frequency", limits=c(0,50))

#because data is scewed with giant outlier, have recalibrated to show what majority of the data looks like 
filt <- explore %>% filter(rssum < 50) 
ggplot(data = filt, mapping = aes(x=rssum)) + geom_bar() + scale_x_continuous(name ="Number of Partners", limits=c(-1,40)) + scale_y_continuous(name ="Frequency", limits=c(0,20))

  • ii) Max, min, median, and 1st and 3rd quantiles partners/partner
summary(explore)
##       X1                rssum       
##  Length:198         Min.   :  0.00  
##  Class :character   1st Qu.:  6.00  
##  Mode  :character   Median : 12.00  
##                     Mean   : 21.77  
##                     3rd Qu.: 50.00  
##                     Max.   :173.00
c. How many species each partner is involved in projects on
  • i) Histogram of species /partner
  • ii) Max, min, median, and 1st and 3rd quantiles species/partner
## 
##   1   2   3   4   6   7  18  22  42 
## 171  17   3   1   1   1   1   1   1
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   1.000   1.000   1.000   1.589   1.000  42.000
d. Network representation of who works with who. Partners as nodes (weighted by number of species they work on), edges between partners weighted by number of species they work on together

Gwen’s note - or use choard diagram - I don’t think this lends itself to a high number of “nodes” ** Need to look at network documentation to find out how to set up from matrix (everything I’ve read so far has been based on lists)

current code source -https://www.mjdenny.com/Preparing_Network_Data_In_R.html


c) Does the number of actions relate to species characteristics?

a. Same as the number of partners analysis, but with number of actions as the response variable
  • note to self - this might not knit if vector is produced lower in script so repeating code here to find no actions per species
  • decided to try two new predictors (the addition is the number of actions increased by each partner doing that action (might make sense to change this to a weighted value later on))
####copied text to find new variable 
alldf <- sdata #change variable name 

#species <- select(alldf, Scientific.name, X1..Land.Water.Management ,X2..Species.Management, X3..Awareness.raising,X4..law.enforcement.and.prosecution,X5..livelihood..economic.and.moral.incentrives, X6..Conservation.Design.and.Planning,X7..Legal.and.Policy.frameworks,X8..Research.and.monitoring, X9..Education.and.Training, X10..Institutional.Development,funding) #selecting all relevant columns


species <- alldf[,c(2,10:20)] #unable to knit with select so indexed 

eachsp <- species %>% group_by(Scientific.name) %>% summarise_each(funs(sum)) 
## Warning: funs() is soft deprecated as of dplyr 0.8.0
## Please use a list of either functions or lambdas: 
## 
##   # Simple named list: 
##   list(mean = mean, median = median)
## 
##   # Auto named with `tibble::lst()`: 
##   tibble::lst(mean, median)
## 
##   # Using lambdas
##   list(~ mean(., trim = .2), ~ median(., na.rm = TRUE))
## This warning is displayed once per session.
onesandzeros <- eachsp %>% mutate_if(is.numeric, ~1 * (. > 0)) #changed all values back to ones and zeros

noactionswithpartner <- eachsp %>% mutate(actpartsum = rowSums(onesandzeros[,c(2:12)]))

totalno <- onesandzeros %>% mutate("actionsum" = rowSums(onesandzeros[,c(2:12)])) #added column "actionsum" that took the row sums for each species to give count of how many actions each species receives 


#add to regression predictor df with join()

newregdata <- RegData #doesn't work because nothing to join by 
modtdata <- tdata

#first going to join totalno and noactionswithpartner

predictors <- noactionswithpartner %>% full_join(totalno, by = "Scientific.name") #join two new predictors together
#predictors1 <- predictors %>% select(Scientific.name, actionsum, actpartsum)
predictors1 <- predictors[,c(1,13,25)] #unable to knit with select so indexed 
predictors2 <- rename(predictors1, scientific_name = Scientific.name) #renamed predictors vector so would have something to join by 

#so if select values that don't change from tdata with somes that weren't modified in RegData, should be able to join
#modtdata <- modtdata %>% select(scientific_name, total_x_x, area_x) #select relevant predictors from tdata 
modtdata <- modtdata[,c(3,26,32)] #unable to knit with select so indexed 
newregdata <- newregdata %>% left_join(modtdata, by = c("total_x_x", "area_x")) #note 1 name is missing, but will see if that is an issue based on names in action dataset 

# now going to join with predictors 

regdf <- predictors2 %>% left_join(newregdata, by = "scientific_name")
#so issue now isn't the missing name, but the number of species that have missing na values 

(which(is.na(regdf$total_x_x))) #so have 7 missing values? 
## [1]  7 15 36
# Moxostoma - in my dataset name is "Moxostoma sp 2"
# Erigeron basalticus - no information in PartnerData but should be in dataset****

###should remove these species from cleaning_salafsky script 
# Thymallus arcticus had issue with subspecies - should be NA
# Dalea tentaculoides ^^ - document was not for partnership
# Cymopterus deserticola ^ - even though link is different for Detailed_methods and salasfkycoding (AND both are broken) I think the main difference is Vol1 vs Vol2 - have indicated that Vol1 was wrong year so remove
# Cordylanthus nidularius ^ - I think this should also be removed (from book not ca)
# Calochortus persistens - year is wrong here, should have been removed from S dataset 

regdf <- regdf[-which(is.na(regdf$taxa)),] # here i deleted all the rows with an NA
#regdf <- regdf %>% select(-c(X1, total, scientific_name))#remove column X1
regdf <- regdf[,-c(1,4,5)] #unable to knit with select so indexed 
#reorder so that predictors are at end of df 
regdf <- regdf[,c(3:7,1,2)] 

#lets try this first without the repetition from partners

#regdf1 <- regdf %>% select(-c(actpartsum))

regdf1 <- regdf[,-c(6)]

############### 


#set up
PredictorsOnlyPixel <- regdf1[,c(1)]
PredictAndResponsePixel <- regdf1
PredictAndResponseGrid <- regdf1
  
# put histograms on the diagonal panel  
panel.hist <- function (x,...)                  # define a function that says what we want to plot in the diagonal
{
  usr <- par("usr"); on.exit(par(usr))          # not sure what usr is for?
  par(usr = c(usr[1:2],0,1.5))
  h <- hist(x, plot = FALSE)
  breaks <- h$breaks; nB <- length(breaks)          # make the hist 
  y <- h$counts; y <- y/max(y)
  rect(breaks[-nB], 0, breaks[-1], y, col="grey", ...)  # defines what the histogram is going to look like
}

# put correlations on the upper panels,
panel.cor <- function(x, y, digits=2, prefix="", cex.cor, ...)
{
  usr <- par("usr"); on.exit(par(usr))
  par(usr = c(0, 1, 0, 1))
  r <- cor(x, y,use="everything")               
  txt <- format(c(r, 0.123456789), digits=digits)[1]
  prefix <- "r = "
  rc <- cor.test(x,y,method = c("pearson"))             ## calculate pearsons rho for upper grid
  txt <- paste(prefix,txt,sep="")
  text(0.5, 0.5, txt, cex = 1)
}

## plot a correlation matrix plot that uses the functions specified above to say what to plot where
      ## this was taken directly from website and still not plotting r values for all 
#pairs(PredictAndResponsePixel[1:6], lower.panel=panel.smooth, cex = .8, diag.panel=panel.hist, cex.labels = 1.2, font.labels=2, upper.panel=panel.cor)

pairs(PredictAndResponsePixel,lower.panel = panel.smooth, diag.panel=panel.hist,upper.panel=panel.cor)

# so still having some issues with getting r values to print out 

######### VIFs

vif(lm(actionsum ~ area_x +percentpublic + taxa + nsumemploy + total_x_x,data = regdf1))
##        area_x percentpublic          taxa    nsumemploy     total_x_x 
##      1.070998      1.295225      1.407448      1.103866      1.136759
#looks good 


######## Run the model for actionsum 

lm_actionsum <- lm(actionsum ~ area_x +percentpublic + taxa + nsumemploy + total_x_x,data = regdf1)
summary(lm_actionsum)
## 
## Call:
## lm(formula = actionsum ~ area_x + percentpublic + taxa + nsumemploy + 
##     total_x_x, data = regdf1)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.5029 -1.3443  0.0087  1.1988  4.7766 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    4.164e+00  8.884e-01   4.687 6.53e-05 ***
## area_x         1.615e-11  1.021e-11   1.582    0.125    
## percentpublic  1.370e+00  1.382e+00   0.991    0.330    
## taxa          -1.109e+00  8.748e-01  -1.268    0.215    
## nsumemploy     1.015e-05  1.140e-05   0.890    0.381    
## total_x_x      1.139e-01  2.745e-01   0.415    0.681    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.09 on 28 degrees of freedom
##   (2 observations deleted due to missingness)
## Multiple R-squared:  0.1819, Adjusted R-squared:  0.03583 
## F-statistic: 1.245 on 5 and 28 DF,  p-value: 0.3148
#lose significance of threats
#plot(lm_actionsum)




################### other predictor


#lets try this first without the repetition from partners

#regdf2 <- regdf %>% select(-c(actionsum))
regdf2 <- regdf[,-c(7)]

#set up
PredictorsOnlyPixel <- regdf2[,c(1)]
PredictAndResponsePixel <- regdf2
PredictAndResponseGrid <- regdf2
  
# put histograms on the diagonal panel  
panel.hist <- function (x,...)                  # define a function that says what we want to plot in the diagonal
{
  usr <- par("usr"); on.exit(par(usr))          # not sure what usr is for?
  par(usr = c(usr[1:2],0,1.5))
  h <- hist(x, plot = FALSE)
  breaks <- h$breaks; nB <- length(breaks)          # make the hist 
  y <- h$counts; y <- y/max(y)
  rect(breaks[-nB], 0, breaks[-1], y, col="grey", ...)  # defines what the histogram is going to look like
}

# put correlations on the upper panels,
panel.cor <- function(x, y, digits=2, prefix="", cex.cor, ...)
{
  usr <- par("usr"); on.exit(par(usr))
  par(usr = c(0, 1, 0, 1))
  r <- cor(x, y,use="everything")               
  txt <- format(c(r, 0.123456789), digits=digits)[1]
  prefix <- "r = "
  rc <- cor.test(x,y,method = c("pearson"))             ## calculate pearsons rho for upper grid
  txt <- paste(prefix,txt,sep="")
  text(0.5, 0.5, txt, cex = 1)
}

## plot a correlation matrix plot that uses the functions specified above to say what to plot where
      ## this was taken directly from website and still not plotting r values for all 
#pairs(PredictAndResponsePixel[1:6], lower.panel=panel.smooth, cex = .8, diag.panel=panel.hist, cex.labels = 1.2, font.labels=2, upper.panel=panel.cor)

pairs(PredictAndResponsePixel,lower.panel = panel.smooth, diag.panel=panel.hist,upper.panel=panel.cor)

# so still having some issues with getting r values to print out 

######### VIFs

vif(lm(actpartsum ~ area_x +percentpublic + taxa + nsumemploy + total_x_x,data = regdf2))
##        area_x percentpublic          taxa    nsumemploy     total_x_x 
##      1.070998      1.295225      1.407448      1.103866      1.136759
#looks good 


######## Run the model for actpartsum 

lm_actpartsum <- lm(actpartsum ~ area_x +percentpublic + taxa + nsumemploy + total_x_x,data = regdf2)
summary(lm_actpartsum)
## 
## Call:
## lm(formula = actpartsum ~ area_x + percentpublic + taxa + nsumemploy + 
##     total_x_x, data = regdf2)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -3.5029 -1.3443  0.0087  1.1988  4.7766 
## 
## Coefficients:
##                 Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    4.164e+00  8.884e-01   4.687 6.53e-05 ***
## area_x         1.615e-11  1.021e-11   1.582    0.125    
## percentpublic  1.370e+00  1.382e+00   0.991    0.330    
## taxa          -1.109e+00  8.748e-01  -1.268    0.215    
## nsumemploy     1.015e-05  1.140e-05   0.890    0.381    
## total_x_x      1.139e-01  2.745e-01   0.415    0.681    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 2.09 on 28 degrees of freedom
##   (2 observations deleted due to missingness)
## Multiple R-squared:  0.1819, Adjusted R-squared:  0.03583 
## F-statistic: 1.245 on 5 and 28 DF,  p-value: 0.3148
#lose significance of threats
#plot(lm_actpartsum)

lm_actpartsum <- lm(log(actpartsum) ~ area_x +percentpublic + taxa + nsumemploy + total_x_x,data = regdf2)
#still nothing 


d) How are actions distributed across partners?

From a planning perspective, we care about actions because decision makers need to know a) what needs to be done for species recovery, and b) which actors can best do it. Thus, we want to know “what actions are different partners doing and how do they contribute to the sum total of what needs to be done for different species?”



Data Exploration

a. Frequency distribution (histogram) of how many partners are participating in each type of action
df <- as_tibble(sdata)
df[,c(10:20)] <- sapply(df[ ,c(10:20)], as.numeric)

#rowSums((code2[,c(11:21)]))

colsum <- (as.data.frame(colSums(df[,c(10:20)])) #creating dataframe so can plot
  %>% rownames_to_column()) #making sure that dataframe has rownames to set as x and y 
 
  
colsum <- colsum  %>% rename(count = `colSums(df[, c(10:20)])`) #renaming column produced by colsums
  

#ggplot(colsum) + geom_point(mapping = aes(x = rowname, y = count))

ggplot(colsum) + geom_bar(mapping = aes(x = rowname, y = count), stat = "identity")+ theme(axis.text.x = element_text(angle = 90)) + scale_x_discrete(name ="Name of Action")

#same graph, just expanded with text shifted
ggplot(colsum) + geom_bar(mapping = aes(x = rowname, y = count), stat = "identity")+ theme(axis.text.x = element_text(angle = 30))  + scale_x_discrete(name ="Name of Action")

b. For each species
  • i) For each species: how many different actions are applied (and how many of each action).
# loop 
  #filter by scientific names
  # For each unique value 
  # column sum
  #table output of each? (??)

alldf <- df #change variable name 
#colnames(alldf[,c(10:20)])

species <- select(alldf, Scientific.name, X1..Land.Water.Management ,X2..Species.Management, X3..Awareness.raising,X4..law.enforcement.and.prosecution,X5..livelihood..economic.and.moral.incentrives, X6..Conservation.Design.and.Planning,X7..Legal.and.Policy.frameworks,X8..Research.and.monitoring, X9..Education.and.Training, X10..Institutional.Development,funding) #selecting all relevant columns

#Make work as numeric
### this works


eachsp <- species %>% group_by(Scientific.name) %>% summarise_each(funs(sum)) 

#eachsp #for each species, summed actions done by each partner
kable(eachsp) #prints out weird # this is actually working in markdown
Scientific.name X1..Land.Water.Management X2..Species.Management X3..Awareness.raising X4..law.enforcement.and.prosecution X5..livelihood..economic.and.moral.incentrives X6..Conservation.Design.and.Planning X7..Legal.and.Policy.frameworks X8..Research.and.monitoring X9..Education.and.Training X10..Institutional.Development funding
Aliciella caespitosa 1 0 0 0 0 5 1 3 5 0 0
Allium gooddingii 2 1 0 0 0 0 2 1 1 3 0
Astragalus anserinus 1 1 0 0 0 0 0 0 0 0 2
Castilleja christii 2 1 2 0 0 2 2 2 1 2 1
Cicindela albissima 2 1 0 0 0 0 0 0 1 0 2
Cimicifuga arizonica 1 1 0 0 0 1 1 2 1 2 1
Erigeron basalticus 0 1 0 0 0 1 0 1 0 0 0
Euphydryas anicia cloudcrofti 2 1 0 0 0 0 0 2 1 1 4
Fallicambarus gordoni 2 0 2 0 1 1 0 1 0 1 0
Horkelia hendersonii 1 1 0 0 0 0 0 0 1 1 1
Iotichthys phlegethontis 6 3 0 1 0 4 1 4 0 8 1
Lepidium papilliferum 1 3 0 0 0 0 0 1 0 0 0
Lithobates onca 2 1 0 0 2 1 2 2 1 3 2
Lupinus aridus ssp. ashlandensis 1 1 0 0 0 0 0 2 0 1 1
Moxostoma 2 5 0 0 0 1 0 5 0 4 3
Nerodia erythrogaster neglecta 2 0 1 1 0 3 1 0 1 0 0
Oncorhynchus clarkii virginalis 4 2 0 0 0 0 0 2 0 4 0
Oncorhynchus mykiss aquilarum 3 2 1 0 0 0 0 2 0 0 0
Oncorhynchus mykiss ssp. 2 0 0 0 0 4 1 2 0 1 1
Opuntia X multigeniculata 1 1 0 0 0 1 0 0 0 0 0
Pediocactus paradinei 3 0 0 0 0 0 1 0 0 1 1
Penstemon scariosus albifluvis 1 0 0 0 0 1 0 0 0 1 0
Phacelia stellaris 4 3 0 0 0 0 1 4 1 3 0
Phrynosoma mcallii 5 0 0 2 0 9 1 1 2 2 2
Potentilla basaltica 2 1 0 0 0 1 0 1 1 0 0
Pseudanophthalmus catorycetes 1 0 0 0 0 1 0 0 0 2 0
Pseudanophthalmus inexpectatus 1 0 0 0 0 1 0 0 0 0 0
Pseudanophthalmus pholeter 1 0 0 0 0 1 0 0 0 2 0
Pyrgulopsis morrisoni 2 2 0 0 0 0 0 2 0 1 0
Sceloporus arenicolus 2 0 0 0 1 4 6 4 1 8 5
Solidago plumosa 1 0 1 0 0 0 0 3 0 1 0
Sonorella macrophallus 0 0 0 0 0 2 1 3 1 0 0
Sylvilagus transitionalis 2 0 2 0 0 2 4 0 0 4 0
Thymallus arcticus 5 0 0 0 0 6 3 3 0 4 2
Urocitellus endemicus 1 4 0 0 0 3 2 2 0 1 4
Urocitellus washingtoni 9 1 3 0 1 9 7 10 5 12 3
Zaitzevia thermae 1 1 1 0 0 2 0 0 1 1 4
#get count of total number of actions for each 
onesandzeros <- eachsp %>% mutate_if(is.numeric, ~1 * (. > 0)) #changed all values back to ones and zeros
totalno <- onesandzeros %>% mutate(actionsum = rowSums(onesandzeros[,c(2:12)])) #added column "actionsum" that took the row sums for each species to give count of how many actions each species receives 
kable(totalno) #this prints out weird # kable is actually working in markdown
Scientific.name X1..Land.Water.Management X2..Species.Management X3..Awareness.raising X4..law.enforcement.and.prosecution X5..livelihood..economic.and.moral.incentrives X6..Conservation.Design.and.Planning X7..Legal.and.Policy.frameworks X8..Research.and.monitoring X9..Education.and.Training X10..Institutional.Development funding actionsum
Aliciella caespitosa 1 0 0 0 0 1 1 1 1 0 0 5
Allium gooddingii 1 1 0 0 0 0 1 1 1 1 0 6
Astragalus anserinus 1 1 0 0 0 0 0 0 0 0 1 3
Castilleja christii 1 1 1 0 0 1 1 1 1 1 1 9
Cicindela albissima 1 1 0 0 0 0 0 0 1 0 1 4
Cimicifuga arizonica 1 1 0 0 0 1 1 1 1 1 1 8
Erigeron basalticus 0 1 0 0 0 1 0 1 0 0 0 3
Euphydryas anicia cloudcrofti 1 1 0 0 0 0 0 1 1 1 1 6
Fallicambarus gordoni 1 0 1 0 1 1 0 1 0 1 0 6
Horkelia hendersonii 1 1 0 0 0 0 0 0 1 1 1 5
Iotichthys phlegethontis 1 1 0 1 0 1 1 1 0 1 1 8
Lepidium papilliferum 1 1 0 0 0 0 0 1 0 0 0 3
Lithobates onca 1 1 0 0 1 1 1 1 1 1 1 9
Lupinus aridus ssp. ashlandensis 1 1 0 0 0 0 0 1 0 1 1 5
Moxostoma 1 1 0 0 0 1 0 1 0 1 1 6
Nerodia erythrogaster neglecta 1 0 1 1 0 1 1 0 1 0 0 6
Oncorhynchus clarkii virginalis 1 1 0 0 0 0 0 1 0 1 0 4
Oncorhynchus mykiss aquilarum 1 1 1 0 0 0 0 1 0 0 0 4
Oncorhynchus mykiss ssp. 1 0 0 0 0 1 1 1 0 1 1 6
Opuntia X multigeniculata 1 1 0 0 0 1 0 0 0 0 0 3
Pediocactus paradinei 1 0 0 0 0 0 1 0 0 1 1 4
Penstemon scariosus albifluvis 1 0 0 0 0 1 0 0 0 1 0 3
Phacelia stellaris 1 1 0 0 0 0 1 1 1 1 0 6
Phrynosoma mcallii 1 0 0 1 0 1 1 1 1 1 1 8
Potentilla basaltica 1 1 0 0 0 1 0 1 1 0 0 5
Pseudanophthalmus catorycetes 1 0 0 0 0 1 0 0 0 1 0 3
Pseudanophthalmus inexpectatus 1 0 0 0 0 1 0 0 0 0 0 2
Pseudanophthalmus pholeter 1 0 0 0 0 1 0 0 0 1 0 3
Pyrgulopsis morrisoni 1 1 0 0 0 0 0 1 0 1 0 4
Sceloporus arenicolus 1 0 0 0 1 1 1 1 1 1 1 8
Solidago plumosa 1 0 1 0 0 0 0 1 0 1 0 4
Sonorella macrophallus 0 0 0 0 0 1 1 1 1 0 0 4
Sylvilagus transitionalis 1 0 1 0 0 1 1 0 0 1 0 5
Thymallus arcticus 1 0 0 0 0 1 1 1 0 1 1 6
Urocitellus endemicus 1 1 0 0 0 1 1 1 0 1 1 7
Urocitellus washingtoni 1 1 1 0 1 1 1 1 1 1 1 10
Zaitzevia thermae 1 1 1 0 0 1 0 0 1 1 1 7
# totalno  #this is working 
  • ii) Action richness and diversity for each species? How is this measured?

  • iii) How many partners does each species have and do partners conduct the same or different actions from each other

nopartners <- (alldf %>% group_by(Scientific.name)   #selecting  each species
%>% distinct(partner.in.agreement)    #count how many partners are distinct 
%>% summarise(n()))                   # Count the number of distinct 

kable(nopartners)
Scientific.name n()
Aliciella caespitosa 6
Allium gooddingii 3
Astragalus anserinus 2
Castilleja christii 2
Cicindela albissima 9
Cimicifuga arizonica 3
Erigeron basalticus 2
Euphydryas anicia cloudcrofti 4
Fallicambarus gordoni 6
Horkelia hendersonii 4
Iotichthys phlegethontis 8
Lepidium papilliferum 3
Lithobates onca 4
Lupinus aridus ssp. ashlandensis 2
Moxostoma 6
Nerodia erythrogaster neglecta 3
Oncorhynchus clarkii virginalis 4
Oncorhynchus mykiss aquilarum 5
Oncorhynchus mykiss ssp. 12
Opuntia X multigeniculata 1
Pediocactus paradinei 3
Penstemon scariosus albifluvis 1
Phacelia stellaris 5
Phrynosoma mcallii 21
Potentilla basaltica 5
Pseudanophthalmus catorycetes 3
Pseudanophthalmus inexpectatus 1
Pseudanophthalmus pholeter 3
Pyrgulopsis morrisoni 4
Sceloporus arenicolus 11
Solidago plumosa 3
Sonorella macrophallus 5
Sylvilagus transitionalis 5
Thymallus arcticus 6
Urocitellus endemicus 4
Urocitellus washingtoni 14
Zaitzevia thermae 6
#part two of question 
- Can find number of partners but unsure how to answer second part of Q
  - "Do partners conduct same or different actions.." for same partner working on each species or across different species? 
  • iv) How many threats does each species face
    - information found in regression prep script
allspindf <- eachsp
new <- tdata #need to join with tdata (renamed here)
###new$scientific_name
#select relevant columns
tthreats <- select(new, scientific_name, hab_x_x:threats_addressed_by_conservation_x_x)

###tthreats$scientific_name
#need to join with tdata 
#to do so need to change col name so match 

tthreats <- rename(tthreats, Sciname = scientific_name)
allspindf <- rename(allspindf, Sciname = Scientific.name)

threats <- left_join(allspindf, tthreats, join_by = Sciname)
## Joining, by = "Sciname"
#total number of threats for each sp
threats$total_x_x
##  [1]  2  1  1  2  2  1  0  5  1  0  4  4  4  0 NA  5  4  3  1  0  3  4  3
## [24]  2  2  1  1  1  3  3  2  1  3 NA  2  1  2
#(missing info for 4 sp) [already checked plustwo dataset which was joined when did work for regression]

kable(head(threats)) #printing out top 6 lines of code in table
Sciname X1..Land.Water.Management X2..Species.Management X3..Awareness.raising X4..law.enforcement.and.prosecution X5..livelihood..economic.and.moral.incentrives X6..Conservation.Design.and.Planning X7..Legal.and.Policy.frameworks X8..Research.and.monitoring X9..Education.and.Training X10..Institutional.Development funding hab_x_x over_x_x poll_x_x spsp_x_x env_x_x demo_x_x total_x_x threats_addressed_by_conservation_x_x
Aliciella caespitosa 1 0 0 0 0 5 1 3 5 0 0 1 0 0 0 0 1 2 NA
Allium gooddingii 2 1 0 0 0 0 2 1 1 3 0 1 0 0 0 0 0 1 Habitat modification
Astragalus anserinus 1 1 0 0 0 0 0 0 0 0 2 0 0 0 1 0 0 1 Species-Species interaction
Castilleja christii 2 1 2 0 0 2 2 2 1 2 1 1 0 0 1 0 0 2 NA
Cicindela albissima 2 1 0 0 0 0 0 0 1 0 2 1 1 0 0 0 0 2 Habitat modification
Cimicifuga arizonica 1 1 0 0 0 1 1 2 1 2 1 1 0 0 0 0 0 1 Habitat modification
c. Graphs
  • i) For each action – how many species receive that action
# take column sum of onesandzeros dataset created above (chunk b. For each species - i.. )
actionsums <- colSums(onesandzeros[,c(2:12)]) #add all species for which that action happened

actionsums <- (as.data.frame(actionsums) #had to change to df
               %>% rownames_to_column() #moving rownames to columns
               %>% rename(count = actionsums)) #renaming column produced by colsums

kable(actionsums)
rowname count
X1..Land.Water.Management 35
X2..Species.Management 22
X3..Awareness.raising 8
X4..law.enforcement.and.prosecution 3
X5..livelihood..economic.and.moral.incentrives 4
X6..Conservation.Design.and.Planning 24
X7..Legal.and.Policy.frameworks 17
X8..Research.and.monitoring 25
X9..Education.and.Training 16
X10..Institutional.Development 26
funding 18
ggplot(actionsums) + geom_bar(mapping = aes(x = rowname, y = count), stat = "identity") + theme(axis.text.x = element_text(angle = 90)) + scale_x_discrete(name ="Name of Action")

#same graph, just expanded with text shifted
ggplot(colsum) + geom_bar(mapping = aes(x = rowname, y = count), stat = "identity")+ theme(axis.text.x = element_text(angle = 30))  + scale_x_discrete(name ="Name of Action")

  • ii) For each action – how many times is it applied total (counting each speciesXpartner separately) information contained in csum output
## for all speciesXpartners csum but this includes some repetition ("partners" which are actually multiple partners working on same action)
csum <- (colSums(sdata[,c(10:20)]))

#check csum
check <- colSums(eachsp[,c(2:12)])
#yes get the same values 

#redo csum calculation and take out = M

##
noM <- sdata[-which(sdata$type.of.partners == "M"),]
newcsum <- (colSums(noM[,c(10:20)]))

newcsum <- as.data.frame(actionsums) #had to change to df
               
ggplot(newcsum) + geom_bar(mapping = aes(x = rowname, y = count), stat = "identity") + theme(axis.text.x = element_text(angle = 90)) + scale_x_discrete(name ="Name of Action")

#same graph, just expanded with text shifted
ggplot(newcsum) + geom_bar(mapping = aes(x = rowname, y = count), stat = "identity")+ theme(axis.text.x = element_text(angle = 30))  + scale_x_discrete(name ="Name of Action")

  • iii )For each action – how many partners apply that action to at least 1 species
partners <- select(alldf, partner.in.agreement, Scientific.name, X1..Land.Water.Management ,X2..Species.Management, X3..Awareness.raising,X4..law.enforcement.and.prosecution,X5..livelihood..economic.and.moral.incentrives, X6..Conservation.Design.and.Planning,X7..Legal.and.Policy.frameworks,X8..Research.and.monitoring, X9..Education.and.Training, X10..Institutional.Development,funding) #selecting all relevant columns

class(partners$funding)
## [1] "numeric"
#so summarise will only accept 1 value per group so going to try and do for each group 
#there is definetly a more elegant/better way to do this but this works
x1 <- partners %>% group_by(partner.in.agreement) %>% summarise(sum(partners[,c(3)])) #84
x2 <- partners %>% group_by(partner.in.agreement) %>% summarise(sum(partners[,c(4)])) #45
x3 <- partners %>% group_by(partner.in.agreement) %>% summarise(sum(partners[,c(5)])) #13
x4 <- partners %>% group_by(partner.in.agreement) %>% summarise(sum(partners[,c(6)])) #4
x5 <- partners %>% group_by(partner.in.agreement) %>% summarise(sum(partners[,c(7)])) #5   
x6 <- partners %>% group_by(partner.in.agreement) %>% summarise(sum(partners[,c(8)])) #68
x7 <- partners %>% group_by(partner.in.agreement) %>% summarise(sum(partners[,c(9)])) #38
x8 <- partners %>% group_by(partner.in.agreement) %>% summarise(sum(partners[,c(10)])) #72
x9 <- partners %>% group_by(partner.in.agreement) %>% summarise(sum(partners[,c(11)])) #26
x10 <- partners %>% group_by(partner.in.agreement) %>% summarise(sum(partners[,c(12)])) #75
x11 <- partners %>% group_by(partner.in.agreement) %>% summarise(sum(partners[,c(13)])) #41
## (for each row, value is to the right)

partnerapptosp <- as.data.frame(c(x1, x2, x3, x4,x5, x6,x7,x8,x9,x10,x11))

#condense into one df 
partnerapptosp <- partnerapptosp[,-c(1,3,5,7,9,11,13,15,17,19,21)] #remove relplicated columns (and don't need to know partner ids)
#colnames(partnerapptosp)
setnames(partnerapptosp, old = c('sum.partners...c.3...', 'sum.partners...c.4...', 'sum.partners...c.5...', 'sum.partners...c.6...', 'sum.partners...c.7...', 'sum.partners...c.8...', 'sum.partners...c.9...', 'sum.partners...c.10...', 'sum.partners...c.11...', 'sum.partners...c.12..', 'sum.partners...c.13..'),skip_absent=TRUE, new = c('Land.Water.Management' , 'Species.Management', 'Awareness.raising','law.enforcement.and.prosecution', 'livelihoodeconomic', 'ConservationDesign', 'LegalandPolicy', 'ResearchMonitoring', 'Education.and.Training', 'InstitutionalDevelopment','funding')) #for some reason this wouldn't over write c12 or c13 (institutional development and funding) so setting manually below 

partnerapptosp <- partnerapptosp %>% rename(InstitutionalDevelopment = 'sum.partners...c.12...')
partnerapptosp <- partnerapptosp %>% rename(funding = 'sum.partners...c.13...')

partnerapptosp <- partnerapptosp[2,] #only need to select 1 row 
action <- t(partnerapptosp) 
#trying not to loose rownames when convert

actiondf <- data.frame(action = row.names(action),action) #changed to df and set column names 

kable(actiondf)
action X2
Land.Water.Management Land.Water.Management 79
Species.Management Species.Management 38
Awareness.raising Awareness.raising 13
law.enforcement.and.prosecution law.enforcement.and.prosecution 4
livelihoodeconomic livelihoodeconomic 5
ConservationDesign ConservationDesign 66
LegalandPolicy LegalandPolicy 37
ResearchMonitoring ResearchMonitoring 65
Education.and.Training Education.and.Training 25
InstitutionalDevelopment InstitutionalDevelopment 74
funding funding 40
ggplot(actiondf) + geom_bar(mapping = aes(x = action, y = X2), stat = "identity")+ scale_y_continuous(name ="Count")

#same graph with names rotated 
ggplot(actiondf) + geom_bar(mapping = aes(x = action, y = X2), stat = "identity") + theme(axis.text.x = element_text(angle = 90)) + scale_x_discrete(name ="Name of Action") + scale_y_continuous(name ="Count")

#col_list <- partners[,c(3:13)]
#for(coln in col_list){
#  partners %>% group_by(partner.in.agreement) %>% summarise(sum(partners[,coln]))
#} ## Got error 
d. Thoughts about typologies/groupings
  • i) Can we group partners into types based on the actions that they do? (numbers and type of actions?)

  • ii) Can we group partners base on type of organization (fed agency, wildlife agency, private landowner, ngo, researchers?)

  • iii) Can we group species based on the sets of threats that they face?

  • iv) East coast v west coast (makes most sense to focus on west coast split as there are more species here) primary exploration attached below in spatial analysis section

Questions we want to answer

e. How are total actions distributed across partners (e.g., what proportion of land management is enacted by FWS?)

Annabelle Spatial analysis

Background to analysis

  • map of centers and ranges (load in pdf) Informative caption Map: All species, for which I have partner data, have their ranges colored in orange and the average center of their range indicated with a green dot. Ranges may be overlapping and were calculated at a county scale.

west coast v east coast #### variable okay produced in RegressionPredictoExploration range chunk removing all 0s columns not working

 combo <- read.csv(paste0("/usr/local/bin/store/partner_rff/OutputAndInputsForThesis/combodf.csv"))
center <- read_excel(paste0(DataSource, "/mean_center_range_subsetofprecluded_with_partnerdata.xlsx"))

#### correct values here 
ugh <- combo
#ugh <- ugh[,c(2,6)] #had to re calibrate when reloaded the data
okay <- ugh %>% left_join(center, by = "scientific_name")
#so have slight issue where Chorizanthe parryi var fernandina is splet Chorizanthe parryi var fernandina and Chorizanthe parryi var. fernandina (period after var.)
#okay[53,2] <- 2

okay <- okay[-which(is.na(okay$XCoord)),]


#plot(okay$x, okay$y)

## divide by xcoord value to make east coast v westcoast column 

okay <- okay %>% mutate(geo = x < -95)
summary(okay$geo) #adams cave beetle have same point which is why getting 10 but only see 9 points
okay[which(okay$geo == FALSE),9] <- 0
okay[which(okay$geo == TRUE),9] <- 1
## true or 1 means species is westcoast

west <- okay %>% filter(geo == TRUE)
fullwest <- west %>% left_join(full)

east <- okay %>% filter(geo == FALSE)
fulleast <- east %>% left_join(full)

west <- okay %>% filter(geo == TRUE)   #### ***need to find what geo is - I think this was where made e/w divide but idk where in previous code that was 
west <- west [,c(1,3)]
west <- west %>% left_join(PartnerData, by = "scientific_name")
nzwest <- west[,-c(1:4)][,colSums(west[,-c(1:4)])>=1] #removed columns with all 0s
dim(west)
nzwest <- as.matrix(nzwest) #prep for making adj matrix
wajmat <- t(nzwest) %*% nzwest #new adjmat
tidyweajmat <- as_tibble(wajmat, rownames = "id")


east <- okay %>% filter(geo == FALSE)
east <- east [,c(1,3)]
east <- east %>% left_join(PartnerData, by = "scientific_name") 
nzeast <- east[,-c(1:4)][,colSums(east[,-c(1:4)])>=1]
dim(nzeast)
nzeast <- as.matrix(nzeast) #prep for making adj matrix
eajmat <- t(nzeast) %*% nzeast #new adjmat
#need to change from matrix to tidyr 
#need to change to tidyverse for later code
tidyeajmat <- as_tibble(eajmat, rownames = "id")


#this is just the diagonal matrix from adj matrix 


graphme <- wajmat
graphme <- graphme %>% melt()
graphme <- graphme %>% filter(Var1 == Var2)
graphme <- graphme[,-c(1)] #removed duplicate names
#all partners
ggplot(data = graphme) + geom_col(mapping = aes(x=Var2, y=value))
g_one <- graphme %>% filter(value >1)
ggplot(data = g_one, mapping = aes(x=Var2, y=value)) + geom_col() + theme(axis.text.x = element_text(angle = 90)) + scale_y_continuous(name ="Number of Species", limits=c(0,45)) + scale_x_discrete(name ="Name of Partner")
  • variograms
 combo <- read.csv(paste0("/usr/local/bin/store/partner_rff/OutputAndInputsForThesis/combodf.csv"))
 center <- read_excel(paste0(DataSource, "/mean_center_range_subsetofprecluded_with_partnerdata.xlsx"))

  
#make changes so can join 
combo$common_name[10] <- "CHRISTS PAINTBRUSH" #the accent on the o gets messed up in this species so set to same
## Warning in `[<-.factor`(`*tmp*`, 10, value = structure(c(NA, 18L, 31L,
## 1L, : invalid factor level, NA generated
center$common_name[9] <- "CHRISTS PAINTBRUSH"
combo$common_name[1] <- "San Fernando Valley spineflower" #loaded without common name 
## Warning in `[<-.factor`(`*tmp*`, 1, value = structure(c(NA, 18L, 31L, 1L, :
## invalid factor level, NA generated
#now join the two df and drop species with data missing 

full <- center %>% left_join(combo, by = "common_name")
## Warning: Column `common_name` joining character vector and factor, coercing
## into character vector
#remove ones without partner data

full <- full[-which(is.na(full$total)),]

#remove extra scientific name column 
full <- full[,-c(7,8)]

#note, only one species was removed due to inadequate information about geographic range (artic greyling)

Generate variogram with raw data - L10 was unable to run as part of markdown script (previously in .r so that might be source of problem) current error is “object logp not found”

Head <- full #made in chunk above
Head <- Head[,c(5,6,9)] #select only necessary columns 
Head <- as.data.frame(Head)
#
Head$logp = log(Head$total)
#
# remove NA values 
### Head <- Head[-which(is.na(Head$total)),]
#
library(sp)
#data(meuse)
#
coordinates(Head) = c("x", "y")
#
library(RColorBrewer)
library(classInt)
library(gstat)
  #     hscat(logp~1,data=Head, breaks=c(0,5,10,15,20,25,30,200))
hscat(logp~1,data=Head, breaks=c(0,1,2,3,5,10,15,20,25,30,35))    ## axes are wrong? Plots look square.. 
#
# How many points are there?
#
length(Head$logp)
#
# How many pairs of points?
#
choose(49,2)
#
# Scatter plot of squared differences
#
plot(variogram(logp~1,Head,cloud=T))
#
# Differences averaged for distance increments
# to show empirical variogram
#
logzinc.vario = variogram(logp~1,Head)
#
# Take a look at the contents of logzinc.vario
#
logzinc.vario
#
# Now plot the empirical variogram
# and identify the number of pairs of points
# going into each estimate
#
plot(gamma~dist,data=logzinc.vario,xlim=c(0,30))
#text(logzinc.vario$dist+100,logzinc.vario$gamma,logzinc.vario$np)
#
# Sometimes outliers can have a big effect
# Note that for this data set one outlier will go into the 
# computation of the squared difference 155 times.
# Cressie suggests a robust measure of the variogram.
#
logzinc.vario.robust = variogram(logp~1,Head,cressie=T)
#
par(mfrow=c(1,2))
plot(gamma~dist,data=logzinc.vario,type="b")
#title("Classical Variogram")
plot(gamma~dist,data=logzinc.vario.robust,type="b") 
#title("Robust Variogram")
par(mfrow=c(1,1))
#
#
# *** this part of the code doesn't work right yet.. don't think the lines would be super relevant for this data anyway? 
#
#
# Do a permutation test by shifting the values associated with
# each location around at random (i.e. permute them)
# and compute the variogram and plot it each time this is done.
#
# First plot the original variogram estimate
#
plot(gamma~dist,data=logzinc.vario,type="n")
#
# Now do 100 permutations and plot the resulting "pure nugget" effect
# Where our variogram lies outside this envelope show significant
# correlation.
#
x=data.frame(Head)$x
y=data.frame(Head)$y
logzinc=Head$logp
id0=seq(length(Head$logp))
for(i in 1:100)
{
  id = sample(id0)
  hold.perm = data.frame(x=x,y=y,logp=logp[id])
  coordinates(hold.perm)=c("x","y")
  hold.vario = variogram(logp~1,hold.perm)
  lines(gamma~dist,data=hold.vario,col=2)
}
lines(gamma~dist,data=logzinc.vario,col=1,type="b")
#
# Examining for Anisotropy
#
logzinc.vario.dir = variogram(logp~1,Head,alpha=c(0,45,90,135))
plot(logzinc.vario.dir)
  • regression exploration
  • Pat Sullivan’s notes - have c&p into word document